To establish lists of words with unexpected frequencies in long sequences,
for instance in a molecular biology context, one needs to
quantify the exceptionality of families of word frequencies in random
sequences. To this end, we
study large deviation probabilities of multidimensional word counts
for Markov and hidden Markov models.
More specifically, we compute local Edgeworth expansions
of arbitrary order for multivariate partial sums of lattice-valued
functionals of finite Markov chains. This yields sharp approximations of
the associated large deviation probabilities.
We also provide detailed simulations. In particular, these exhibit
previously unreported periodic oscillations, for which we give
theoretical explanations.
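
As a concrete illustration of the objects studied here, the following minimal sketch simulates a sequence from a first-order Markov chain and computes the multidimensional count vector of a family of words, counting overlapping occurrences. It is not the method of the paper: the function names, the four-letter alphabet, the uniform transition probabilities and the word family are all assumptions chosen for illustration only.

```python
import random


def simulate_markov_chain(transition, initial, length, rng):
    """Draw a sequence of `length` >= 1 letters from a first-order Markov chain.

    `initial` maps each state to its initial probability and
    `transition[s]` maps each state to its probability given previous state s.
    """
    states = list(initial)
    seq = rng.choices(states, weights=[initial[s] for s in states])
    while len(seq) < length:
        row = transition[seq[-1]]
        seq.extend(rng.choices(list(row), weights=list(row.values())))
    return "".join(seq)


def count_words(sequence, words):
    """Return the tuple of overlapping occurrence counts, one per word."""
    return tuple(
        sum(1 for i in range(len(sequence) - len(w) + 1) if sequence.startswith(w, i))
        for w in words
    )


# Hypothetical example: uniform chain on the DNA alphabet (an assumption).
alphabet = "ACGT"
initial = {a: 0.25 for a in alphabet}
transition = {a: {b: 0.25 for b in alphabet} for a in alphabet}

rng = random.Random(0)
sequence = simulate_markov_chain(transition, initial, 1000, rng)
counts = count_words(sequence, ["ACG", "CGA"])  # two-dimensional word count
```

Repeating the simulation many times and recording how often the count vector falls in a given region yields a crude Monte Carlo estimate of the corresponding probability. Such an estimate becomes unreliable for rare events, which is where sharp analytic approximations are needed.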